CAUSA 2.0: accurate and consistent evolutionary analysis of proteins using codon and amino acid unified sequence alignments
Abstract
Multiple sequence alignment (MSA) is widely used to reveal structural and functional changes leading to genetic differences among species, and to reconstruct evolutionary histories of related genes, proteins and genomes. Traditionally, proteins and their coding sequences (CDSs) are aligned and analyzed separately, and the two analyses often lead to drastically different conclusions from the same data set. Here we present a new alignment strategy, Codon and Amino Acid Unified Sequence Alignment (CAUSA) 2.0, which aligns proteins and their coding sequences simultaneously. CAUSA 2.0 efficiently optimizes the alignment of CDSs at both the codon and amino acid levels. Theoretical analysis showed that CAUSA 2.0 enhances the entropy information content of the MSA. Empirical data analysis demonstrated that CAUSA 2.0 is more accurate and consistent than nucleotide-, protein- or codon-level alignments. CAUSA 2.0 locates in-frame indels more accurately, makes the alignment of coding sequences more biologically meaningful, and reveals several novel mutation mechanisms that relate to some genetic diseases. CAUSA 2.0 is available at www.DNAPlusPro.com.
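To make the idea of an in-frame, codon-level alignment concrete, the following minimal Python sketch threads a coding sequence onto an already aligned protein, so that every gap in the protein becomes a three-nucleotide gap in the CDS. This is not the CAUSA 2.0 algorithm, which the abstract describes as aligning both levels simultaneously; it is a simpler, commonly used back-threading step shown only for illustration. The function names `thread_codons` and `translate` and the example sequences are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an ungapped CDS, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def thread_codons(aligned_protein, cds):
    """Place codons under an aligned protein so that all gaps stay in frame.

    aligned_protein: protein row of an alignment, with '-' for gaps.
    cds: the ungapped coding sequence of that protein.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    out = []
    k = 0  # index of the next unused codon
    for residue in aligned_protein:
        if residue == "-":
            out.append("---")  # one amino acid gap = one codon-sized gap
            continue
        if k >= len(codons) or CODON_TABLE.get(codons[k]) != residue:
            raise ValueError(f"CDS does not encode residue {residue!r} at codon {k}")
        out.append(codons[k])
        k += 1
    return "".join(out)


# Hypothetical example: the protein MKL aligned with one gap.
aligned_cds = thread_codons("MK-L", "ATGAAACTG")
print(aligned_cds)
```

Because gaps are only ever inserted in multiples of three, translating the gap-free codons of the result recovers the original protein, which is the consistency property a codon-level alignment is meant to preserve.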
Similar resources
Accurate Reconstruction of Molecular Phylogenies for Proteins Using Codon and Amino Acid Unified Sequence Alignments (CAUSA)
An Evolutionary Relationship Between Stearoyl-CoA Desaturase (SCD) Protein Sequences Involved in Fatty Acid Metabolism
Background: Stearoyl-CoA desaturase (SCD) is a key enzyme that converts saturated fatty acids (SFAs) to monounsaturated fatty acids (MUFAs) in fat biosynthesis. Despite being crucial for interpreting SCDs’ roles across species, the evolutionary relationship of SCD proteins across species has yet to be elucidated. This study aims to present this evolutionary relationship based on amino aci...
Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates
Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN∕dS ratio. For amino-acid sequences, one widely-used method is called Rate4Site, and it assigns a relative conservation score to each site in an alignment. How site-wise dN∕dS values relate to Rate4Site scores is not known...
Molecular Characterization of a Three-disulfide Bridges Beta-like Neurotoxin from Androctonus crassicauda Scorpion Venom
Scorpion venom is the richest source of peptide toxins with high levels of specific interactions with different ion-channel membrane proteins. The present study involved the amplification and sequencing of a 310-bp cDNA fragment encoding a beta-like neurotoxin active on sodium ion-channel from the venom glands of scorpion Androctonus crassicauda belonging to the Buthidae family using r...
Investigation of Solvent Effect on CUA Codon Mutation: NMR Shielding Study
P53 is one of the genes that play an important role in the human cell cycle and in human cancers. Models of codon substitution make it possible to separate mutational biases in the DNA from selective constraints on the protein, and offer a great advantage over amino acid models for understanding the evolutionary process of proteins and protein-coding DNA sequences. In this work, we investigated abou...
Journal title:
- PeerJ PrePrints
Volume 3, Issue
Pages -
Publication date 2015